DETAILED ACTION
Claim Rejections - 35 USC § 103 

1. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for
all obviousness rejections set forth in this Office action:

(a) A patent may not be obtained though the invention is not identically disclosed or described 
as set forth in section 102 of this title, if the differences between the subject matter sought to
be patented and the prior art are such that the subject matter as a whole would have been 
obvious at the time the invention was made to a person having ordinary skill in the art to which 
said subject matter pertains. Patentability shall not be negatived by the manner in which the 
invention was made. 

2. Claims 1, 4, 7, 9, 12, 15-16, 19, 22-23 and 32 are rejected under 35
U.S.C. 103(a) as being unpatentable over Okazaki (US Patent: 5,666,555) in
view of Suzuki et al. (US Patent: 5,736,982).

As to claim 1, Okazaki discloses a multichannel information processing
device (i.e. the multiple video signals coming into the system, such as from a VTR and LD)
(see Fig. 1, Col. 2, Lines 41-57) wherein a plurality of video images are displayed
simultaneously on a display device (i.e. multiple windows on the display screen
of the computer displaying separate video signals) (see Fig. 1, Col. 2, Lines 58-68),
comprising:

video image information control means (103 CPU) for acquiring
information for said plurality of video images, and for deciding video image
position information relating to display positions on a display device (108) for said
plurality of video images (i.e. images coming from 101 image reproduction
apparatus) and outputting said information for a plurality of video images based
on said video image position information (i.e. since the CPU processes the video
image information and assigns this information to a plurality of windows on
the Bit Map Display 108, it must assign a position value for each of the windows
in order to display each of the video images correctly) (see Fig. 1, Col. 2,
Lines 58-68);

cursor position control means (103 CPU) for calculating cursor position
information based on cursor instructions information input via an input device (i.e.
keyboard 111, pointing device 105) and generating and outputting cursor image
information based on said cursor position information (i.e. since the cursor is
generated by the IOP 104 and controlled and processed by the CPU 103 when
the user input device 105 is activated, the cursor position must be calculated by the CPU in
order to allow the position value to be properly accessed and utilized for the
selection of the various windows) (see Fig. 1, Col. 3, Lines 5-12);

display image generating means (106) for synthesizing information for the
plurality of video images output by said video image display control means (103)
and cursor image information output by the cursor position control means (103)
and displaying the same on said display device (i.e. since the multiple video
images are displayed on the Bit Map Display 108 in the form of multiple windows and
each of the windows can be selected by the cursor to activate its audio content,
the video images are successfully synthesized and displayed) (see Fig. 1, Col. 2,
Lines 41-68);

distance information generating means (CPU 103) for calculating
distances between the display positions of said plurality of video images and a
cursor display position based on the video image position information for said
plurality of video images and the cursor position information calculated by said
cursor position information control means, and generating distance information
(i.e. since each of the windows is selectable by the cursor to activate its audio
content, the distance is therefore one or zero: if the cursor is found to be
within the window, it is zero and the audio is outputted, and when the cursor is
outside the window, the distance is one and no audio is outputted) (see Fig. 1,
Col. 3, Lines 5-12);

and audio output control means (i.e. Audio Selector 102) for deciding
volume of audio data for said plurality of video images based on the distance
information generated by said distance information generating means, and
outputting audio data to an output device (i.e. since when the distance is zero,
where the cursor is found to be inside the window, then the corresponding audio
of the video in the window is outputted at a preset volume, and when the
distance is one, where the cursor is not inside the window, the volume is zero and
the audio is not outputted) (see Fig. 1, Col. 3, Lines 5-12).
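
The reading of Okazaki applied above, in which the distance is zero when the cursor lies inside a window and one otherwise, can be illustrated with a minimal sketch. The names Window, binary_distance and select_volume, the rectangular window model, and the preset volume of 1.0 are hypothetical and appear neither in Okazaki nor in the claims.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Window:
    """Axis-aligned display window (hypothetical model, not from Okazaki)."""
    x: int
    y: int
    width: int
    height: int


def binary_distance(window: Window, cursor: tuple[int, int]) -> int:
    """Return 0 if the cursor is inside the window, otherwise 1."""
    cx, cy = cursor
    inside = (window.x <= cx < window.x + window.width
              and window.y <= cy < window.y + window.height)
    return 0 if inside else 1


def select_volume(distance: int, preset: float = 1.0) -> float:
    """Preset volume at distance 0, muted at distance 1 (assumed values)."""
    return preset if distance == 0 else 0.0
```

Under this reading only two volume values are possible, which is the gap the secondary reference is relied upon to fill.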

However, Okazaki does not explicitly teach wherein said audio output
control means sets volume of said audio data to one of multiple values so as to
be in inverse proportion to distance values generated by said distance
information generating means, synthesizes said audio data corresponding to said
plurality of video images displayed by said display image generating means,
using said respective volumes, and outputs said synthesized audio data. Suzuki
teaches wherein said audio output control means sets volume (i.e. the dB loss
according to the audio signal) of said audio data (i.e. audio data of the voice
conversation) to one of multiple values so as to be in inverse proportion to
distance values generated by said distance information generating means,
synthesizes said audio data corresponding to said plurality of video images (i.e.
since each avatar is displayed on the video display 10) (see Fig. 3) displayed
by said display image generating means (12), using said respective volumes,
and outputs said synthesized audio data (i.e. the avatars are displayed and the
voices of the avatars are carried to the user according to the loss in dB, where
the farther away the avatar is, the lower the volume) (see Suzuki, Figs. 3, 17, Col. 15,
Lines 31-46).
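
The limitation discussed above, volume set to one of multiple values in inverse proportion to distance followed by synthesis of the audio data, can be sketched as follows. This is an illustration only: the reference distance, the number of discrete levels, the clamping at short range, and the function names are assumptions and are not taken from Suzuki, Okazaki, or the application.

```python
def volume_for_distance(distance: float, ref: float = 100.0, levels: int = 8) -> float:
    """Gain in [0, 1] that falls off as ref / distance, snapped to discrete levels.

    Distances at or below ref are clamped to full volume (an assumption).
    """
    gain = 1.0 if distance <= ref else ref / distance
    return round(gain * levels) / levels


def mix(streams: list[list[float]], gains: list[float]) -> list[float]:
    """Sum equal-length, non-empty sample streams after scaling each by its gain."""
    length = len(streams[0])
    return [sum(g * s[i] for s, g in zip(streams, gains)) for i in range(length)]
```

For example, with the assumed defaults a source at twice the reference distance is mixed at half volume, whereas the binary scheme would mute it entirely.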

Therefore it would have been obvious for one of ordinary skill in the art at 
the time the invention was made to have used the virtual space based audio 
processing system of Suzuki in the computing environment of Okazaki in order to 
provide a virtual space display method which lends realism to the virtual space 
auditorily (see Suzuki, Col. 1, Lines 53-55).

As to claim 16, Okazaki teaches a computer-readable recording medium
(110) storing a program controlling a computer having a display device (108), an
input device (111) and an audio output device (106) to execute a multichannel
information processing for displaying a plurality of video images simultaneously
on the display device (i.e. the multiple windows on the screen each having video
images displayed coming from the disk which stores image data and audio data)
(see Fig. 4, Col. 4, Lines 14-41), according to operations comprising:

deciding display positions on the display device for said video images to
be displayed (i.e. since the computer CPU 103 processes the video image data and
outputs the data on to a plurality of display windows, it necessarily decides the
positions that the video image takes up in order to properly display it on bit map
display 108) (see Fig. 4, Col. 4, Lines 14-26);

outputting information for said plurality of video images based on the 
decided display positions (i.e. the CPU 103 sends the video images into the
various windows on the bitmap display 108 based on the display address 
assigned) (see Fig. 4, Col. 4, Lines 14-34); 

accepting cursor instructions information input from said input device 
(pointing device 105) (i.e. the pointing device 105 inputting the pointer 
information via the IOP 104 to the CPU 103) (see Fig. 4, Col. 4, Lines 47-58);

calculating cursor position information for displaying a cursor based on
said cursor instructions information (i.e. the location of the pointer on the bitmap
display 108 must be calculated to update the operation of the pointing device
105) (see Fig. 4, Col. 4, Lines 47-58);

generating cursor image information based on said cursor position
information (i.e. the image of the pointer on the bitmap display 108 is outputted;
therefore it must be generated after the update is made on the operations of the
pointing device 105) (see Fig. 4, Col. 4, Lines 47-58);

synthesizing information for said plurality of video images and said cursor 
image information, generating a display image, and displaying the display image 
on said display device (i.e. the outputted screen synthesized image data is output 
as an image signal to the bit map display 108 via the D/A converter 305, and 
since the pointer is present, it must also be synthesized) (see Fig. 4, Col. 4, Lines
20-27); 

calculating distances between the display positions of said plurality of 
video images and the display position of said cursor and generating distance 
information (i.e. since the distance of the selected window is zero (i.e. selection)
for turning on, when the cursor is outside the perimeter of a window it has a
distance of one (i.e. non-selection), which is calculated based upon the position
of the windows and the current pointer position) (see Fig. 4, Col. 4, Lines 46-63);

and deciding volume of audio data for said plurality of video images based 
on said distance information and outputting the audio data to the audio output 
device, wherein the deciding of the volume of the audio data (i.e. the volume of 
the audio of the video is selected based upon the cursor, where the window that 
is selected has the nominal value and the non-selected has a muted volume) 
(see Fig. 4, Col. 4, Lines 46-63).

However, Okazaki does not explicitly teach setting the volume of said
audio data for the plurality of video images to one of multiple values in inverse 
proportion to said distances; synthesizing said audio data corresponding to said 
plurality of video images, using said respective volumes; and outputting said 
synthesized audio data to the audio output device. 

Suzuki teaches setting the volume of said audio data for the plurality of 
video images to one of multiple values in inverse proportion to said distances (i.e. 
setting the conversation volume with the dB change according to the distance 
away) (see Fig. 17, Col. 15, Lines 35-50); synthesizing said audio data 
corresponding to said plurality of video images, using said respective volumes 
(i.e. since the conversations are from various terminals and all of the conversations
can be heard by the user, they are synthesized with a set volume) (see Fig. 2, Col. 5,
Lines 1-8); and outputting said synthesized audio data to the audio output device 
(i.e. the audio output is sent to speaker) (see Fig. 3, Col. 5, Lines 10-17). 

Therefore, the combination of Okazaki and Suzuki meets the limitation. 

As to claim 19, note the discussion of claim 16 above, claim 19 differs
from claim 16 only in the limitation of: generating direction information relating to
direction of display position for each video image as seen from cursor display
position; and outputting to said audio output device so that audio data
corresponding to said plurality of video images is positioned at acoustic image
positions in the sound space of said audio output device in accordance with said
distance information and said direction information.

Okazaki teaches generating direction information relating to direction of
display position for each video image as seen from cursor display position (i.e.
since the pointer value that is represented on the display 108 is in two directions, x
and y, the video also has those directions, since that is how a bitmap display
receives the data and displays it on screen) (see Fig. 4, Col. 4, Lines 46-63);

and outputting to said audio output device so that audio data 
corresponding to said plurality of video images is positioned at acoustic image 
positions in the sound space of said audio output device (speaker 108) in 
accordance with said distance information and said direction information (i.e. 
since the selected audio data is output based on the pointer selection, it
is in the sound space of the speaker as the data values that are outputted there)
(see Fig. 4, Col. 4, Lines 46-63). 
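
The claim 19 limitation of positioning audio at acoustic image positions according to direction as seen from the cursor can be pictured with a small stereo-panning sketch. Everything here is hypothetical: the use of only the horizontal offset, the constant-power pan law, the half_width normalization, and the function name are assumptions chosen for illustration, not features of Okazaki or of the application.

```python
import math


def stereo_gains(cursor_x: float, image_x: float,
                 half_width: float = 640.0) -> tuple[float, float]:
    """Constant-power (left, right) gains from the image's horizontal
    offset relative to the cursor (assumed pan law)."""
    # Normalize the offset to [-1, 1]: -1 is far left, +1 is far right.
    offset = max(-1.0, min(1.0, (image_x - cursor_x) / half_width))
    angle = (offset + 1.0) * math.pi / 4.0  # maps to 0 .. pi/2
    return math.cos(angle), math.sin(angle)
```

An image directly at the cursor receives equal left and right gains, while an image far to the right is routed almost entirely to the right channel.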

As to claim 23, note the discussion of claim 16 above, claim 23 is broader 
in scope than claim 16 and is rejected on the same ground. 

As to claim 32, note the discussion of claim 16 above, claim 32 differs from
claim 16 only in that claim 16 is a method claim and claim 32 is an apparatus
claim and is regarded as previously discussed with respect to claim 16 above.

As to claim 4, Okazaki teaches a multichannel information processing
device according to claim 1, wherein distance information generated by said 
distance information generating means (103) includes direction information 
relating to direction of video image display position as seen from cursor display 
position (i.e. since the windows on the screen have a two-dimensional layout, x
and y, by which an input cursor is placed, the directions are accounted for during
the calculation for the positional information of the window as compared to the
cursor), and said audio output control means (102) makes output to an audio 
output device based on said distance information, so that audio data for said 
plurality of video images is positioned in the sound space formed by said audio 
output device (i.e. since each of the windows are selectable by the cursor to 
activate its audio content the distance is therefore one or zero where if the cursor 
was found to be within the window it is zero and the audio is outputted and when 
the cursor is outside the window then the distance is 1 and no audio is outputted. 
Also the audio signal of any windows must be in the sound space of the speaker 
109 when it is outputted since it is the only audio output means in the system) 
(see Fig. 1, Col. 3, Lines 5-12).

As to claim 7, Okazaki teaches a multichannel information processing
device according to claim 1, further including video image selecting means for
selecting, based on a prescribed algorithm (i.e. since the video images reside in
the windows on the screen and the window is selected by the cursor, an
algorithm is used to determine the window that is selected and therefore the
video that is selected), a specified video image from among a plurality of video
images displayed on said display device, wherein said audio output control
means outputs to an audio output device audio data for the video image selected
by said video image selecting means (i.e. since the audio and video signals are
presented together when the window is selected, both the audio and video are
output upon selection) (see Fig. 1, Col. 3, Lines 5-12).

As to claims 9, 12 and 15, these claims differ from claims 1, 4 and 7 only
in that claims 1, 4 and 7 are apparatus claims, whereas claims 9, 12 and 15 are
method claims. Thus, claims 9, 12 and 15 are analyzed as previously discussed
with respect to claims 1, 4 and 7 above.

3. Claims 20-21, 5 and 13 are rejected under 35 U.S.C. 103(a) as being
unpatentable over Okazaki in view of Suzuki as applied to claims 1, 4, 7, 9, 12,
15-16, 19, 22-23 and 32, further in view of Yamagami (U.S. Patent 6,334,025).

As to claim 20, note the discussion of claim 16 above, Okazaki and Suzuki
do not teach a step for voice-recognizing words included in audio data for said
plurality of video images; a step for converting voice-recognized words into
character data and outputting the same.

Yamagami teaches a step for voice-recognizing words included in audio
data for said plurality of video images (i.e. the CPU 13 executes the audio
recognition, causing the result to be displayed in the display section 402); a step
for converting voice-recognized words into character data and outputting the
same (i.e. the CPU 13 executes the audio recognition, causing the result to be
displayed in the display section 402) (see Figs. 4, 9, Col. 10, Lines 5-23, Col. 12,
Line 60 - Col. 13, Line 10).

Therefore, it would have been obvious for one of ordinary skill in the art at 
the time the invention was made to combine the voice recognition capability of 
Yamagami with the multi-window display of Okazaki in order to allow a more
efficient storage of the annotation of audio and video data (Yamagami, Col. 1, 
Lines 65-68). 

As to claim 21, note the discussion of claim 16 and claim 20 above, claim
21 differs from claim 20 only in the addition of two additional steps:

calculating distance between the display positions of said plurality
of video images and said cursor position information and generating distance
information;

selecting a specified video image from among the plurality of video images 
based on said distance information and outputting audio data of the selected 
video image to the audio output device; 

Okazaki teaches calculating distances between the display position of said
plurality of video images and display position of said cursor and generating
distance information (i.e. since the distance of the selected window is zero (i.e.
selection) for turning on, when the cursor is outside the perimeter of a window it
has a distance of one (i.e. non-selection), which is calculated based upon the
position of the windows and the current pointer position) (see Fig. 4, Col. 4, Lines
46-63); selecting a specified video image from among the plurality of video
images based on said distance information and outputting audio data of the
selected video image to the audio output device (i.e. the audio data and the video
data of the selected window are output to the bitmap display 108 and the speaker
109 based on the distance zero, which is when the cursor is actually in the
window selected) (see Fig. 4, Col. 4, Lines 46-63).

As to claim 5, Okazaki teaches a multichannel information processing 
device according to claim 1, but does not teach voice data recognition means for
recognizing words included in audio data for said plurality of video images, and 
character information display means for converting words recognized by said 
voice data recognition means into character data and displaying the same on 
said display device. 

Yamagami teaches voice data recognition means (13 CPU) for
recognizing words included in audio data for said plurality of video images, and
character information display means (13 CPU) for converting words recognized
by said voice data recognition means into character data and displaying the
same on said display device (i.e. the CPU 13 executes the audio recognition,
causing the result to be displayed in the display section 402) (see Figs. 4, 9, Col.
10, Lines 5-23, Col. 12, Line 60 - Col. 13, Line 10).

Therefore, it would have been obvious for one of ordinary skill in the art at 
the time the invention was made to combine the voice recognition capability of 
Yamagami with the multi-window display of Okazaki.

As to claim 13, this claim differs from claim 5 only in that claim 5 is an 
apparatus claim, whereas claim 13 is a method claim. Thus, claim 13 is 
analyzed as previously discussed with respect to claim 5 above. 

4. Claims 22, 6, and 14 are rejected under 35 U.S.C. 103(a) as being
unpatentable over Okazaki in view of Suzuki and further in view of Yamagami as
applied to claims 20-21, 5 and 13 above, and further in view of Hilpert, Jr. et al.
(U.S. Patent 6,469,712).

As for claim 22, note the discussion of claim 20 above, Yamagami does
not explicitly teach a step for connecting to the Internet; a step for searching for
related web sites on the Internet using a voice-recognized word as keyword.

Hilpert teaches a step for connecting to the Internet (i.e. Internet access
program); a step for searching for related web sites (i.e. Net Search) on the
Internet using a keyword (i.e. since the web browser is an extension to the
conventional data processing capability of an individual personal computer, it is
natural to use the web search capability to enhance the operation of the
computer) (Col. 3, Line 50 - Col. 4, Line 50).

Therefore, it would have been obvious for one of ordinary skill in the art at
the time the invention was made to combine the web searching capability of
Hilpert with the text recognition design of Yamagami (i.e. having the text
recognition process further enhanced by use of Internet search for more
detailed information) in order to provide the user additional interactions with the
images displayed to assist visually impaired users (Hilpert, Col. 1, Lines 50-66).

As to claim 6, note the discussion of Claim 5, Yamagami does not 
explicitly teach Internet connection means, web site search means for searching 
for related web sites on the Internet and web site display means for displaying on 
said display device a web site found by said web site search means. 

Hilpert teaches Internet connection means (i.e. Internet access program), 
web site search means (i.e. Net Search) for searching for related web sites on 
the Internet and web site display means for displaying on said display device a 
web site found by said web site search means (i.e. since the web browser is an
extension to the conventional data processing capability of an individual personal
computer, it is natural to use the web search capability to enhance the operation
of the computer) (Col. 3, Line 50 - Col. 4, Line 50).

Therefore, it would have been obvious for one of ordinary skill in the art at
the time the invention was made to combine the web searching capability of
Hilpert with the text recognition design of Yamagami (i.e. having the text
recognition process further enhanced by use of Internet search for more
detailed information).

As to claim 14, this claim differs from claim 6 only in that claim 6 is an 
apparatus claim, whereas claim 14 is a method claim. Thus, claim 14 is 
analyzed as previously discussed with respect to claim 6 above. 

5. Claim 8 is rejected under 35 U.S.C. 103(a) as being unpatentable over 
Okazaki in view of Suzuki and further in view of Tarabella (U.S. Patent
5,796,945). 

As to claim 8, note the discussion of Claim 7, Okazaki does not explicitly 
teach video image selecting means switches to a different video image for 
selection whenever a prescribed length of time has passed. 

Tarabella teaches video image selecting means switches to a different
video image for selection whenever a prescribed length of time has passed (i.e.
the video clip that is capable of being displayed can be controlled for the length
of time that it is to be displayed before a change is to take place) (see Col. 5,
Lines 1-53).

Therefore, it would have been obvious for one of ordinary skill in the art at
the time the invention was made to combine the time based pre-set selection
capability of Tarabella with the image selection system of Okazaki, in order to make
the computer display more productive during idle time (see Tarabella, Col. 2,
Lines 5-13).

Response to Arguments 

6. Applicant's arguments with respect to the claims have been considered but are
moot in view of the new ground(s) of rejection.



Conclusion 

7. THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of 
time policy as set forth in 37 CFR 1.136(a).

A shortened statutory period for reply to this final action is set to expire 
THREE MONTHS from the mailing date of this action. In the event a first reply is 
filed within TWO MONTHS of the mailing date of this final action and the advisory 
action is not mailed until after the end of the THREE-MONTH shortened statutory 
period, then the shortened statutory period will expire on the date the advisory 
action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be
calculated from the mailing date of the advisory action. In no event, however, will 
the statutory period for reply expire later than SIX MONTHS from the mailing 
date of this final action. 



Inquiry

Any inquiry concerning this communication or earlier communications from
the examiner should be directed to CALVIN C. MA whose telephone number is
(571)270-1713. The examiner can normally be reached on 7:30-5:00.

If attempts to reach the examiner by telephone are unsuccessful, the 
examiner's supervisor, Chanh Nguyen, can be reached on 571-272-7772. The
fax phone number for the organization where this application or proceeding is 
assigned is 571-273-8300. 

Information regarding the status of an application may be obtained from 
the Patent Application Information Retrieval (PAIR) system. Status information 
for published applications may be obtained from either Private PAIR or Public 
PAIR. Status information for unpublished applications is available through 
Private PAIR only. For more information about the PAIR system, see
http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR
system, contact the Electronic Business Center (EBC) at 866-217-9197
(toll-free). If you would like assistance from a USPTO Customer Service
Representative or access to the automated information system, call
800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/Chanh Nguyen/ 

Supervisory Patent Examiner, Art 
Unit 2629 

Calvin Ma 
May 22, 2008 



